Unsupervised Discovery of Relations and Discriminative Extraction Patterns

Authors

  • Alan Akbik
  • Larysa Visengeriyeva
  • Priska Herger
  • Holmer Hemsen
  • Alexander Löser
Abstract

Unsupervised Relation Extraction (URE) is the task of extracting relations of a priori unknown semantic types using clustering methods on a vector space model of entity pairs and patterns. In this paper, we show that an informed feature generation technique based on dependency trees significantly improves clustering quality, as measured by the F-score, and therefore the ability of the URE method to discover relations in text. Furthermore, we extend URE to produce a set of weighted patterns for each identified relation that can be used by an information extraction system to find further instances of this relation. Each pattern is assigned to one or multiple relations with different confidence strengths, indicating how reliably a pattern evokes a relation, using the theory of Discriminative Category Matching. We evaluate our findings in two tasks against strong baselines and show significant improvements both in relation discovery and information extraction.
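The vector space model of entity pairs and patterns mentioned in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy observations, the surface-string patterns (the paper derives features from dependency trees), and the greedy threshold clustering, which stands in for whatever clustering method the authors actually use.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: each observation is (entity pair, pattern seen between them).
observations = [
    (("Obama", "Hawaii"), "X was born in Y"),
    (("Obama", "Hawaii"), "X , a native of Y"),
    (("Einstein", "Ulm"), "X was born in Y"),
    (("Google", "YouTube"), "X acquired Y"),
    (("Google", "YouTube"), "X bought Y"),
    (("Facebook", "Instagram"), "X acquired Y"),
]

def build_vectors(obs):
    """Map each entity pair to a sparse count vector over patterns."""
    vectors = {}
    for pair, pattern in obs:
        vectors.setdefault(pair, Counter())[pattern] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(vectors, threshold=0.5):
    """Greedy single-pass clustering (an illustrative stand-in, not the paper's method).

    Each pair joins the first cluster whose seed pair is similar enough;
    otherwise it starts a new cluster.
    """
    clusters = []
    for pair, vec in vectors.items():
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(pair)
                break
        else:
            clusters.append([pair])
    return clusters

vectors = build_vectors(observations)
clusters = cluster(vectors)
for members in clusters:
    print(members)
```

In this toy setting, each resulting cluster plays the role of one discovered relation, and the patterns shared by its members are candidate extraction patterns for that relation. The confidence weighting via Discriminative Category Matching described in the abstract is not reproduced here.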

Similar resources

Effective Selectional Restrictions for Unsupervised Relation Extraction

Unsupervised Relation Extraction (URE) methods automatically discover semantic relations in text corpora of unknown content and extract for each discovered relation a set of relation instances. Due to the sparsity of the feature space, URE is vulnerable to ambiguities and underspecification in patterns. In this paper, we propose to increase the discriminative power of patterns in URE using sele...

Unsupervised Feature Selection for Relation Extraction

This paper presents an unsupervised relation extraction algorithm, which induces relations between entity pairs by grouping them into a “natural” number of clusters based on the similarity of their contexts. Stability-based criterion is used to automatically estimate the number of clusters. For removing noisy feature words in clustering procedure, feature selection is conducted by optimizing a ...

Unsupervised Parsing for Generating Surface-Based Relation Extraction Patterns

Finding the right features and patterns for identifying relations in natural language is one of the most pressing research questions for relation extraction. In this paper, we compare patterns based on supervised and unsupervised syntactic parsing and present a simple method for extracting surface patterns from a parsed training set. Results show that the use of surface-based patterns not only i...

URES : an Unsupervised Web Relation Extraction System

Most information extraction systems either use hand written extraction patterns or use a machine learning algorithm that is trained on a manually annotated corpus. Both of these approaches require massive human effort and hence prevent information extraction from becoming more widely applicable. In this paper we present URES (Unsupervised Relation Extraction System), which extracts relations fr...

Extracting Semantic Relationships between Terms: Supervised vs. Unsupervised Methods

As the amount of electronic documents (corpora, dictionaries, newspapers, newswires, etc.) becomes more and more important and diversified, there is a need to extract information automatically from these texts. In order to extract terms and relations between terms, two methods can be used. The first method is the unsupervised approach, which requires a term extraction module and few predefined ...

Publication date: 2012